Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add traces backfill option #615

Merged
merged 6 commits into from
Oct 17, 2024
Merged

Conversation

peterargue
Copy link
Contributor

@peterargue peterargue commented Oct 16, 2024

Closes: #???

Description

The traces downloader does not persist any state to track which blocks have been successfully downloaded. As a result, any time a gateway is rebooted or crashes, there's a possibility that any in-progress trace downloads will be lost.

This PR adds CLI flags to specify an EVM block range to back backfill. When set, the node will kick off a background goroutine that iterates through all blocks in the range, and downloads traces for any transaction that are missing from the db.

Note: the traces engine needs persisted state info to ensure all blocks are downloaded. Since there is already work in progress to generate theses traces locally, we probably do not want to spend the cycles on it now.


For contributor use:

  • Targeted PR against master branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the standards mentioned here.
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

Summary by CodeRabbit

  • New Features

    • Introduced backfilling functionality for transaction traces, allowing users to specify block height ranges for trace data retrieval.
    • Added configuration options for backfilling start and end heights.
    • Enhanced instructions for running the EVM Gateway across different environments.
    • Added a new method for managing trace downloads more effectively.
  • Bug Fixes

    • Enhanced error handling and logging for trace downloading processes.
  • Documentation

    • Updated the README with detailed instructions for building, running, and deploying the EVM Gateway, including new configuration flags.

Copy link
Contributor

coderabbitai bot commented Oct 16, 2024

Walkthrough

The pull request introduces several updates across multiple files, primarily enhancing the README.md documentation for the EVM Gateway project. Key additions include new configuration flags for backfilling transaction traces and expanded instructions on running the EVM Gateway in various environments. Additionally, modifications are made to the ProfileServer method receiver naming for consistency, and the StartTraceDownloader method in the Bootstrap struct is enhanced to incorporate new logic for handling trace data backfilling. A new Backfill method is also added to the Engine struct for improved trace management.

Changes

File Change Summary
README.md Added new configuration flags traces-backfill-start-height and traces-backfill-end-height. Expanded sections on running and debugging the EVM Gateway.
api/profiler.go Updated method receiver name in ListenAddr from h to s.
bootstrap/bootstrap.go Enhanced StartTraceDownloader method with logic for backfilling trace data based on new config parameters.
services/traces/engine.go Added Backfill method for downloading traces in a specified height range. Updated indexBlockTraces to include skipExisting parameter.
services/traces/engine_test.go Modified TestTraceIngestion to remove downloadedHashes map and ensure correct processing of trace ingestion.

Possibly related PRs

Suggested labels

EVM, Documentation

Suggested reviewers

  • m-Peter
  • sideninja
  • zhangchiqing

🐰 In the meadow, changes bloom,
New flags added, dispelling gloom.
With traces backfilled, we hop with cheer,
EVM Gateway shines bright and clear!
Commands now guide, with ease to deploy,
A rabbit's joy, oh what a joy! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 378e60f and 03d06c5.

📒 Files selected for processing (1)
  • services/traces/engine_test.go (0 hunks)
💤 Files with no reviewable changes (1)
  • services/traces/engine_test.go

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@@ -163,6 +163,26 @@ func (b *Bootstrap) StartTraceDownloader(ctx context.Context) error {
)

StartEngine(ctx, b.traces, l)

if b.config.TracesBackfillStartHeight > 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we prevent from backfilling it over and over again after every restart?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can be done by completely removing the provided values for these 2 flags, upon restarting, or simply by shifting their values to not account for past heights, which we know were back-filled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's better that we could somehow remember that we have back-filled, and skip it next time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but soon enough the traces will be generated on demand from the EVM Gateway, so there will be no need for downloading from a GCP bucket. I see this merely as a band-aid to unblock partners that rely on the missing traces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we would just remove the flag from the config after starting it up. We could build a more intelligent solution, but I think this is going away in the next couple months so not worth spending much time on it.

worst case it will scan for all of the traces, and skip downloading them because they already exist in the db

| `force-start-height` | `0` | Force-set starting Cadence height (local/testing use only) |
| `wallet-api-key` | `""` | ECDSA private key for wallet APIs (local/testing use only) |
| `filter-expiry` | `5m` | Expiry time for idle filters |
| `traces-gcp-bucket` | `""` | GCP bucket name for transaction traces |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed something that could be problematic, though not really related to this PR.
The traces-gcp-bucket accepts a single bucket name, but for mainnet we already have 2:

  • mainnet26-evm-execution-traces1
  • mainnet25-evm-execution-traces1

We need to make sure that when we back-fill, the provided heights exist in the appropriate bucket, but I don't know what is contained in those 2 buckets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a check that will log a warning if the start height is before the init-cadence height (378e60f). this may not work for nodes that have been up for a while, but not sure what else we can do without adding a bunch of complexity to the startup logic

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense 👍 Not much we can do for the time being. Just thought of mentioning it.

Copy link
Collaborator

@m-Peter m-Peter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (5)
config/config.go (2)

90-93: LGTM! Consider clarifying the default behavior.

The new fields TracesBackfillStartHeight and TracesBackfillEndHeight are well-defined and properly typed. The comments are clear, but could be improved by explicitly stating the default behavior when these values are not set.

Consider updating the comments to clarify the default behavior:

- // TracesBackfillStartHeight sets the starting block height for backfilling missing traces.
+ // TracesBackfillStartHeight sets the starting block height for backfilling missing traces. If 0, backfilling starts from the earliest block.
  TracesBackfillStartHeight uint64
- // TracesBackfillEndHeight sets the ending block height for backfilling missing traces.
+ // TracesBackfillEndHeight sets the ending block height for backfilling missing traces. If 0, backfill continues to the latest block.
  TracesBackfillEndHeight uint64

319-322: LGTM! Consider adding a comment for clarity.

The validation check for trace backfill heights is correctly implemented. It ensures that the start height is less than the end height when both are set, which is crucial for preventing invalid configurations.

Consider adding a brief comment to explain the purpose of this check:

+ // Validate trace backfill height range when both start and end heights are specified
  if cfg.TracesBackfillStartHeight > 0 && cfg.TracesBackfillEndHeight > 0 && cfg.TracesBackfillStartHeight > cfg.TracesBackfillEndHeight {
    return nil, fmt.Errorf("traces backfill start height must be less than the end height")
  }
README.md (3)

175-211: LGTM! Consider adding a note about the relationship between the new flags.

The addition of traces-backfill-start-height and traces-backfill-end-height flags is consistent with the PR objectives. The descriptions are clear and concise.

Consider adding a note in the description of these flags to clarify their relationship. For example:
"Note: Both start and end heights must be set for backfilling to occur."


Line range hint 82-134: LGTM! Consider adding explanations for some terms and values.

The new "Running on Testnet" section provides clear instructions and a helpful example command. It's a valuable addition to the documentation.

Consider the following improvements:

  1. Explain what COA stands for and its significance in the context of the EVM Gateway.
  2. Provide a brief explanation of the --init-cadence-height value (211176670) used in the example command.
  3. Add a note about the importance of keeping the private key (--coa-key) secure and not sharing it.
🧰 Tools
🪛 LanguageTool

[duplication] ~213-~213: Possible typo: you repeated a word
Context: ... | # Deploying Deploying the EVM Gateway node comes with some pr...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~214-~214: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...me prerequisites as well as expectations and they are best explained in the WIP docu...

(COMMA_COMPOUND_SENTENCE)

🪛 Markdownlint

214-214: null
Bare URL used

(MD034, no-bare-urls)


Line range hint 286-303: LGTM! Consider adding context for when to use profiling.

The new "Debugging" section provides clear instructions for enabling and using the profiler. This is a valuable addition for developers working on or troubleshooting the EVM Gateway.

Consider adding a brief explanation of when a developer might need to use profiling. For example:
"Profiling can be useful when diagnosing performance issues or optimizing the EVM Gateway's resource usage."

🧰 Tools
🪛 LanguageTool

[duplication] ~213-~213: Possible typo: you repeated a word
Context: ... | # Deploying Deploying the EVM Gateway node comes with some pr...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~214-~214: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...me prerequisites as well as expectations and they are best explained in the WIP docu...

(COMMA_COMPOUND_SENTENCE)

🪛 Markdownlint

214-214: null
Bare URL used

(MD034, no-bare-urls)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f29d065 and 378e60f.

📒 Files selected for processing (5)
  • README.md (1 hunks)
  • api/profiler.go (1 hunks)
  • bootstrap/bootstrap.go (1 hunks)
  • config/config.go (3 hunks)
  • services/traces/engine.go (3 hunks)
🧰 Additional context used
🔇 Additional comments (8)
api/profiler.go (1)

33-34: Approved: Improved receiver naming consistency

The change from h to s for the method receiver improves code consistency within the ProfileServer struct. This aligns with Go's best practices for using short, descriptive names for receivers and matches the naming convention used in other methods of this struct. While this change doesn't affect functionality, it enhances code readability and maintainability.

config/config.go (2)

165-166: LGTM! New flags are well-defined.

The new command-line flags traces-backfill-start-height and traces-backfill-end-height are correctly implemented. They are properly typed as uint64, have appropriate default values, and include clear descriptions.


90-93: Summary: Trace backfilling configuration successfully implemented

The changes introduce new configuration options for trace backfilling, including:

  1. New Config struct fields for start and end heights
  2. Corresponding command-line flags
  3. Validation to ensure the start height is less than the end height

These additions align well with the PR objectives, providing a mechanism to persistently track which blocks have been successfully downloaded. The implementation is clean, consistent with existing code, and includes necessary validations.

To further improve the code:

  1. Consider clarifying the default behavior in the struct field comments
  2. Add a brief comment explaining the purpose of the validation check

These minor enhancements will increase code readability and maintainability.

Also applies to: 165-166, 319-322

README.md (1)

Line range hint 1-307: Great improvements to the documentation!

The changes to the README.md file significantly enhance the documentation for the EVM Gateway project. The additions include:

  1. New configuration flags for backfilling transaction traces.
  2. Detailed instructions for running the EVM Gateway on testnet.
  3. A new debugging section with profiling instructions.

These updates align well with the PR objectives and provide valuable information for developers working with the EVM Gateway. The documentation is now more comprehensive and user-friendly.

🧰 Tools
🪛 LanguageTool

[duplication] ~213-~213: Possible typo: you repeated a word
Context: ... | # Deploying Deploying the EVM Gateway node comes with some pr...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~214-~214: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...me prerequisites as well as expectations and they are best explained in the WIP docu...

(COMMA_COMPOUND_SENTENCE)

🪛 Markdownlint

214-214: null
Bare URL used

(MD034, no-bare-urls)

services/traces/engine.go (4)

87-87: Appropriate usage of 'skipExisting=false' in Notify method call.

The call to indexBlockTraces now includes the skipExisting parameter set to false, ensuring that new traces are downloaded without skipping any existing ones during normal operation.


91-91: Addition of 'skipExisting' parameter enhances trace indexing flexibility.

The updated signature of indexBlockTraces now accepts a skipExisting boolean parameter, allowing the method to conditionally skip downloading traces that already exist. This improves the method's versatility for different use cases.


110-112: Enhanced logging with 'evm-height' for better traceability.

Including evm-height in the logging context improves the ability to trace and debug issues related to specific EVM block heights.


114-120: Efficiently skipping existing traces during backfill.

The conditional check for skipExisting optimizes the backfill process by avoiding unnecessary downloads of traces that are already present. This enhances performance and reduces redundant operations.

Comment on lines +175 to +178
if err != nil {
l.Error().Err(err).Msg("failed to get block by height")
return
}
Copy link
Contributor

@coderabbitai coderabbitai bot Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Modify error handling to continue backfilling despite retrieval errors.

Currently, the backfill process exits if it encounters an error when retrieving a block or its Cadence ID (lines 175-178 and 185-188). This means that an error on a single block halts the entire backfill process. To improve robustness, consider logging the error and continuing with the next block instead of returning.

Apply this diff to adjust the error handling:

--- a/services/traces/engine.go
+++ b/services/traces/engine.go
@@ -176,7 +176,7 @@
     if err != nil {
         l.Error().Err(err).Msg("failed to get block by height")
-        return
+        continue
     }

@@ -186,7 +186,7 @@
     if err != nil {
         l.Error().Err(err).Msg("failed to get cadence block ID")
-        return
+        continue
     }

Also applies to: 185-188

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returning is intentional. if these blocks don't exist, it's unlikely future blocks will

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterargue, understood. If returning is intentional due to future blocks being unlikely to exist, no changes are necessary.

(_/)
( •_•)
(>♥<)


✏️ Learnings added
Learnt from: peterargue
PR: onflow/flow-evm-gateway#615
File: services/traces/engine.go:175-178
Timestamp: 2024-10-17T18:05:25.775Z
Learning: In the trace backfill process, returning upon encountering an error when retrieving a block or its Cadence ID is intentional because future blocks are unlikely to exist if these blocks don't.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

return
}

e.indexBlockTraces(block, cadenceID, true)
Copy link
Contributor

@coderabbitai coderabbitai bot Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider managing concurrency during backfill to prevent resource exhaustion.

When calling e.indexBlockTraces within the backfill loop, each call potentially spawns multiple goroutines for downloading traces. Iterating over a large block range without controlling concurrency at the backfill level might lead to excessive resource usage. To mitigate this, consider limiting the number of concurrent indexBlockTraces calls or processing blocks in manageable batches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.indexBlockTraces is limits the number of goroutines it will spawn, and blocks until all are complete

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterargue, thank you for the clarification. It's good to know that e.indexBlockTraces limits the number of goroutines and blocks until all are complete. That addresses my concern about potential resource exhaustion during backfill.

🐰😔


✏️ Learnings added
Learnt from: peterargue
PR: onflow/flow-evm-gateway#615
File: services/traces/engine.go:190-190
Timestamp: 2024-10-17T18:04:40.702Z
Learning: In `services/traces/engine.go`, the `e.indexBlockTraces` function limits the number of goroutines to a maximum of 5 and blocks until all are complete, so during backfill, concurrency is already managed to prevent resource exhaustion.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +167 to +197
if b.config.TracesBackfillStartHeight > 0 {
startHeight := b.config.TracesBackfillStartHeight
if _, err := b.storages.Blocks.GetByHeight(startHeight); err != nil {
return fmt.Errorf("failed to get provided start height %d in db: %w", startHeight, err)
}

cadenceStartHeight, err := b.storages.Blocks.GetCadenceHeight(startHeight)
if err != nil {
return fmt.Errorf("failed to get cadence height for backfill start height %d: %w", startHeight, err)
}

if cadenceStartHeight < b.config.InitCadenceHeight {
b.logger.Warn().
Uint64("evm-start-height", startHeight).
Uint64("cadence-start-height", cadenceStartHeight).
Uint64("init-cadence-height", b.config.InitCadenceHeight).
Msg("backfill start height is before initial cadence height. data may be missing from configured traces bucket")
}

endHeight := b.config.TracesBackfillEndHeight
if endHeight == 0 {
endHeight, err = b.storages.Blocks.LatestEVMHeight()
if err != nil {
return fmt.Errorf("failed to get latest EVM height: %w", err)
}
} else if _, err := b.storages.Blocks.GetByHeight(endHeight); err != nil {
return fmt.Errorf("failed to get provided end height %d in db: %w", endHeight, err)
}

go b.traces.Backfill(startHeight, endHeight)
}
Copy link
Contributor

@coderabbitai coderabbitai bot Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add validation to ensure startHeight is less than or equal to endHeight

Currently, there is no check to confirm that startHeight is less than or equal to endHeight. If startHeight is greater than endHeight, the backfill process may not function as intended or could result in errors.

Consider adding the following validation after determining endHeight:

if startHeight > endHeight {
    return fmt.Errorf("startHeight (%d) cannot be greater than endHeight (%d)", startHeight, endHeight)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is checked in another file which runs before this code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterargue, understood, thank you for the clarification. I've noted that the validation for startHeight and endHeight is performed before this code.

(_/)
(°_°)?


✏️ Learnings added
Learnt from: peterargue
PR: onflow/flow-evm-gateway#615
File: bootstrap/bootstrap.go:167-197
Timestamp: 2024-10-17T18:04:03.861Z
Learning: In the `flow-evm-gateway` Go project, the validation ensuring that `startHeight` is less than or equal to `endHeight` is performed before the `StartTraceDownloader` method in `bootstrap/bootstrap.go`, so additional checks in this method are unnecessary.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@peterargue peterargue merged commit 5872cfb into main Oct 17, 2024
2 checks passed
@peterargue peterargue deleted the petera/add-traces-backfill-mode branch October 17, 2024 20:06
This was referenced Oct 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

4 participants